Computationally Efficient Gaussian Process Changepoint Detection and Regression

Authors

Robert Conlin Grande

Jonathan P. How

Richard C. Maclaurin

Paulo C. Lozano

Abstract

Most existing Gaussian process (GP) regression algorithms assume a single generative model, leading to poor performance when data are nonstationary, i.e., generated from multiple switching processes. Existing methods for GP regression over nonstationary data include clustering and changepoint detection algorithms. However, these methods require significant computation, do not come with provable guarantees on correctness and speed, and mostly work only in batch settings. This thesis presents an efficient online GP framework, GP-NBC, that leverages the generalized likelihood ratio test to detect changepoints and learn multiple GP models from streaming data. Furthermore, GP-NBC can quickly recognize and reuse previously seen models. The algorithm is shown to be theoretically sample efficient in terms of limiting mistaken predictions. Empirical results on two real-world datasets and one synthetic dataset show that GP-NBC outperforms state-of-the-art methods for nonstationary regression in terms of regression error and computational efficiency.

The second part of the thesis introduces a reinforcement learning (RL) algorithm, UCRL-GP-CPD, for multi-task RL when the reward function is nonstationary. First, a novel algorithm, UCRL-GP, is introduced for stationary reward functions. Then, UCRL-GP is combined with GP-NBC to create UCRL-GP-CPD, an algorithm for nonstationary reward functions. Unlike previous work in the literature, UCRL-GP-CPD does not make distributional assumptions about task generation, does not assume changepoint times are known, and does not assume that all tasks have been experienced a priori in a training phase. It is proven that UCRL-GP-CPD is sample efficient in the stationary case, will detect changepoints in the environment with high probability, and is theoretically guaranteed to prevent negative transfer. UCRL-GP-CPD is demonstrated empirically on a variety of simulated and real domains.

Thesis Supervisor: Jonathan P. How
Title: Richard C. Maclaurin Professor of Aeronautics and Astronautics
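
The likelihood ratio idea mentioned in the abstract can be illustrated with a minimal sketch. This is not GP-NBC itself: it assumes plain Gaussian models in place of GP predictive distributions, and the function names, window size, and threshold below are hypothetical choices made only for illustration. The test compares how well a recent window of data is explained by the current model versus a model refit to that window, and flags a change when the average log-likelihood ratio exceeds a threshold.

```python
import numpy as np


def gaussian_loglik(y, mean, var):
    """Sum of log N(y_i | mean, var) over the samples in y."""
    y = np.asarray(y, dtype=float)
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * var)
                        - 0.5 * (y - mean) ** 2 / var))


def detect_changepoint(stream, n_init=30, window=20, threshold=2.0,
                       min_var=1e-6):
    """Return the index of the sample at which a change is first flagged.

    Returns None if no change is flagged. All parameter names and default
    values are illustrative assumptions, not taken from the thesis.
    """
    stream = np.asarray(stream, dtype=float)

    # Null model: a Gaussian fit to the first n_init samples.
    mu0 = stream[:n_init].mean()
    var0 = max(stream[:n_init].var(), min_var)

    for t in range(n_init + window, len(stream) + 1):
        recent = stream[t - window:t]

        # Alternative model: maximum-likelihood Gaussian fit to the window.
        mu1 = recent.mean()
        var1 = max(recent.var(), min_var)

        # Average log-likelihood ratio per sample over the window.
        stat = (gaussian_loglik(recent, mu1, var1)
                - gaussian_loglik(recent, mu0, var0)) / window
        if stat > threshold:
            return t - 1  # index of the newest sample in the window
    return None
```

A possible usage: build a stream whose mean jumps partway through, for example `np.concatenate([rng.normal(0, 1, 100), rng.normal(5, 1, 100)])` with `rng = np.random.default_rng(0)`, and pass it to `detect_changepoint`. Averaging the ratio over a window rather than testing single points trades a few samples of detection delay for robustness to isolated outliers.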

Similar resources

Computationally Efficient Gaussian Process

Changepoint Analysis for Efficient Variant Calling

We present CAGe, a statistical algorithm which exploits high sequence identity between sampled genomes and a reference assembly to streamline the variant calling process. Using a combination of changepoint detection, classification, and online variant detection, CAGe is able to call simple variants quickly and accurately on the 90-95% of a sampled genome which differs little from the reference,...

THE UNIVERSITY OF BRITISH COLUMBIA DEPARTMENT OF STATISTICS TECHNICAL REPORT #236 On-line Changepoint Detection and Parameter Estimation for Genome-wide Transcript Analysis

We consider the problem of identifying novel RNA transcripts using tiling arrays. Standard approaches to this problem rely on the calculation of a sliding window statistic or on simple changepoint models. These methods suffer from several drawbacks including the need to determine a threshold to label transcript regions and/or specify the number of transcripts. In this paper, we propose a Bayesi...

Potentially Predictive Variance Reducing Subsample Locations in Local Gaussian Process Regression

Gaussian process models are commonly used as emulators for computer experiments. However, developing a Gaussian process emulator can be computationally prohibitive when the number of experimental samples is even moderately large. Local Gaussian process approximation (Gramacy and Apley, 2015) has been proposed as an accurate and computationally feasible emulation technique. Constructing sub-desi...

TO APPEAR IN SPECIAL ISSUE: ADVANCES IN KERNEL-BASED LEARNING FOR SIGNAL PROCESSING IN THE IEEE SIGNAL PROCESSING MAGAZINE 1 Spatio-Temporal Learning via Infinite-Dimensional Bayesian Filtering and Smoothing

Gaussian process based machine learning is a powerful Bayesian paradigm for non-parametric non-linear regression and classification. In this paper, we discuss connections of Gaussian process regression with Kalman filtering, and present methods for converting spatio-temporal Gaussian process regression problems into infinite-dimensional state space models. This formulation allows for use of com...

Publication date: 2014
